14
The Nature of Living Things
to explore (via mutations) neighbouring (in sequence space) genomes. Hence, bioin-
formatics (applied to genomics) needs a higher level theory than that provided by
existing information theory. An important, although long-term, task of bioinformat-
ics is to determine how biological genomes are chosen such that they are suited to
their tasks, encompassing such aspects.
Unreliable DNA polymerase is a distinct advantage for producing new antibodies
(somatic hypermutation) and for viruses needing to mutate rapidly in order to evade
host defences—provided it is not too unreliable: Eigen (1976) has shown that in a
soup of self-replicating molecules, there is a replication error rate threshold above
which an initially diverse population of molecules cannot converge onto a stable,
optimally replicating one (a quasi-species 45).
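The threshold can be illustrated with a minimal sketch. It assumes the commonly quoted simplified form of the argument (a single master sequence, independent per-site errors, back-mutation neglected), which is not spelled out in the text. The names `error_rate`, `length` and `superiority` (the replication advantage of the master sequence over its mutants) are illustrative assumptions.

```python
import math


def master_persists(error_rate: float, length: int, superiority: float) -> bool:
    """Return True if the master sequence can out-replicate its mutants.

    Assumed condition (simplified single-peak model):
        superiority * (1 - error_rate) ** length > 1
    """
    # Probability that all `length` positions are copied without error.
    error_free_copy = (1.0 - error_rate) ** length
    return superiority * error_free_copy > 1.0


def error_threshold(length: int, superiority: float) -> float:
    """Per-site error rate at which the condition above becomes an equality."""
    return 1.0 - superiority ** (-1.0 / length)


# Hypothetical numbers: a 1000-nucleotide replicator with a tenfold advantage.
threshold = error_threshold(1000, 10.0)
approximation = math.log(10.0) / 1000  # ln(superiority) / length
```

For small error rates the threshold is close to ln(superiority)/length, so under these assumptions longer genomes tolerate proportionally lower error rates.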
Problem. What are the implications of a transcription error rate estimated as 1 in $10^{5}$?
(In contrast, the error rate of DNA replication is estimated as 1 in $10^{10}$.) Calculate
the proportion of proteins containing the wrong amino acids due to mistakes in tran-
scription, assuming that translation is perfect. Compare the result with a translation
error rate estimated as 1 in 3000.
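One possible way to set up the arithmetic is sketched here. It assumes a typical protein of 300 amino acids (hence 900 coding nucleotides), independent errors at each position, and that every transcription error changes an amino acid. None of these assumptions comes from the text, and the last one makes the transcription figure an upper bound, since synonymous substitutions leave the protein unchanged.

```python
def fraction_with_error(error_rate: float, sites: int) -> float:
    """Probability that at least one of `sites` independent copying steps fails."""
    return 1.0 - (1.0 - error_rate) ** sites


AMINO_ACIDS = 300                     # assumed typical protein length
CODING_NUCLEOTIDES = 3 * AMINO_ACIDS  # three nucleotides per codon

# Transcription: 1 error per 10^5 nucleotides, counted over the coding region.
transcription_fraction = fraction_with_error(1e-5, CODING_NUCLEOTIDES)

# Translation: 1 error per 3000 amino acids incorporated.
translation_fraction = fraction_with_error(1 / 3000, AMINO_ACIDS)

print(f"faulty through transcription: {transcription_fraction:.4f}")
print(f"faulty through translation:   {translation_fraction:.4f}")
```

Under these assumptions a little under 1% of proteins would carry a transcription-derived error, against roughly 10% from translation, so translation would be the dominant source of faulty proteins.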
Problem. Explore the suggestion that the quality of a channel (such as a telephone
line) is independent of the actual message.
14.7.3 Recombination
Homologous recombination is a key process in genetics, whereby the rearrangement
of genes can take place. It involves the exchange of genetic material between
two sets of parental DNA during meiosis (Sect. 14.4.1). The mechanism of recogni-
tion and alignment of homologous (i.e., with identical, or almost identical, nucleotide
sequences) sections of duplex (double-stranded) DNA is far less clear than the recog-
nition between complementary single strands; it may depend on the pattern of elec-
trostatically charged (ionized) phosphates, which itself depends slightly but probably
sufficiently on sequence, and can be further modulated by (poly)cations adsorbed on
the surface of the duplex. 46
Following the alignment, the breakage of the DNA takes place, and the broken
ends are then shuffled to produce new combinations of genes; for example, consider
a hypothetical replicated pair of chromosomes, with the dominant gene written in
45 A quasi-species may be defined as a cluster of genomes in sequence space, the diameter of the
cluster being sufficiently small such that almost every sequence can “mate” with every other one
and produce viable offspring. The sequence at the centre of the cluster is called the master sequence.
If the error rate is above the threshold, in principle all possible sequences will be found. See also
Sect. 4.1.2.
46 Kornyshev and Leikin (2001).